tions and the testing results show excellent overall performance in the prediction of the next 24
hours of output power in kW, reaching a Root Mean Square Error (RMSE) value of 0.0721.
This research shows that machine learning algorithms hold some promise for the prediction
of power production based on various weather conditions and measures, which helps in
the management of energy flows and the optimisation of integrating PV plants into power
systems.

1. INTRODUCTION

The importance of solar Photovoltaic (PV) systems is increasing with the ongoing industrial growth and 
the increased energy demand for developed and developing countries [1, 2]. Energy production by PV systems is 
becoming one of the main renewable energy sources as it turns the power of the sun into electricity and this can be 
done repeatedly without causing any damage to the environment. 

The term “Photovoltaic” has been used in English since 1849 to denote the conversion of light into electricity
[3]. Solar PV power plants are installed in two modes: grid-connected and stand-alone (Off-Grid) [4]. Off-Grid
systems are used for isolated or remote areas and are normally on a smaller scale. Grid-connected systems, on the
other hand, are widely operated and have proven hugely beneficial, but they are known to be uncertain,
uncontrollable, and non-schedulable power sources [5]. This is because this type of power production depends on the
variable weather conditions in the geographical area of the system.

To maintain stable power quality and scheduling and to improve investment feasibility, many studies
reported in the literature suggest different modeling, simulation, and prediction methods for the expected power
production of solar PV plants [6, 7]. In [8], the accuracy of one-day-ahead prediction of the power produced by a 1MW
PV system is compared for two methods, Support Vector Machines (SVM) and Multilayer Perceptron (MP) Artificial
Neural Networks (ANNs). It was found that the two algorithms obtained almost the same accuracy, with
0.07 kWh/m² and 0.11 kWh/m² Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), respectively.

Various forecasting methods of PV power output were reviewed in [9]. It was demonstrated that any model
that uses numerically predicted weather data will not take into account the effect of cloud cover and cloud formation when
initializing; therefore, sky imaging and satellite data methods are used to predict the PV power output with higher accuracy.
The article also outlined some key factors affecting the accuracy of prediction, such as forecast horizon, forecasting
interval width, system size and PV panel mounting method (fixed or tracking). The aim of the work published in [10]
was to study the effect of forecast horizon on the accuracy of the method used to predict the PV power production,
which was Support Vector Regression (SVR) using numerically predicted weather data. Two forecast horizons were studied:
up to 2 and up to 25 hours ahead. As expected, forecasting up to 2 hours ahead was more accurate: RMSE and
MAE increased by 13% and 17%, respectively, when the forecast horizon was extended to 25 hours ahead.

The authors of [11] developed and validated a model that adapted an ANN with tapped delay lines and built 
for one day ahead forecasting. The inputs were the irradiation and the sampling hours. The model achieved seasonal 
MAE ranging from 12.2% to 26% in spring and autumn, respectively. 

The research work of [12] compared two short-term forecasting models: the analytical PV power forecasting 
model (APVF) and the MP PV forecasting model (MPVF), with both of the models using numerically predicted 
weather data and past hourly values for PV electric power production. The two models achieved similar results 
(RMSE varying between 11.95% and 12.10%) with forecast horizons covering all daylight hours of one day ahead, 
thus the models demonstrated their applicability for PV electric power prediction. 

A new Physical Hybrid ANN (PHANN) method was proposed in [13] to improve the accuracy of the standard
ANN method. The hybrid method is based on ANN and clear sky curves for a PV plant. The PHANN method 
reduced the Normalized MAE (NMAE) and the Weighted MAE (WMAE) by almost 50% in many days compared 
to the standard ANN method. In [14], the PV energy production for the next day with 15-minutes intervals was 
accurately predicted with a SVM model that uses historical data for solar irradiance, ambient temperature and past 
energy production. The method demonstrated very good accuracy with R² correlation coefficients of more than 90%,
and the coefficient was strongly dependent on the quality of the weather forecast. 

A model using a multilayer perceptron-based ANN was proposed in [5] for one-day-ahead forecasting. The
daily solar power output and atmospheric temperature for 70 days were used for training the ANN. Across the different
settings of the ANN model (number of hidden layers, activation function and learning rule), the minimum Mean Absolute
Percentage Error (MAPE) achieved was 0.855%.

In this research work, ANNs were optimized to find the best learning configurations and map the available 
solar irradiance records into the generated solar PV power. The proposed system provides real-time next-day predictions
for the output power based on the knowledge extracted from the available historical data. These predictions can
be used by many energy management systems [15] and power control systems of grid-tied PV plants [16]. 


2. PV SYSTEMS AND DATA 
The data used in this research were collected from the existing weather station and solar PV plants at Applied 
Science Private University (ASU) as depicted in the map of Figure 1. 


[Map image labelled "ASU Weather Stations". Imagery ©2017 DigitalGlobe, Map data ©2017 Google, ORION-ME Jordan.]


Figure 1. A map showing part of ASU’s campus. 


There are four separate PV systems installed at the university campus for a total generation capacity of
550 kWp: three rooftop-mounted solar systems and one ground-mounted test field. In this work, the power production
data extracted from the PV system ASU09 (Faculty of Engineering) [17] is correlated with the solar irradiance
measured for the same period by the weather station [18], which is located about 175 m from the engineering building
(see Figure 1).

IJECE Vol. 8, No. 1, February 2018: 497-504
IJECE ISSN: 2088-8708


2.1. PV ASU09: Faculty of Engineering


The largest PV system is installed on top of the Faculty of Engineering building with a capacity of 264 kWp.
It consists of 14 SMA Sunny Tripower inverters (17 kW and 10 kW) connected with Yingli Solar (YL 245P-29b-PC)
panels that are tilted by 11° and oriented 36° (S to E).

The dataset used in this research was created using all reported solar irradiance and PV power records between 
15 May 2015 and 30 September 2017. This consists of 19800 PV power and 20808 weather station records with one 
hour frequency. 


3. THE PROPOSED PREDICTION SYSTEM 
3.1. Preprocessing 


As shown in Figure 2, the first stage of our system is to make sure that all data entries are consistent and 
available for both solar irradiance and PV power per instance of time. 


[Block diagram: Weather Station Data and PV Plant Data -> Filtering and Event Association -> Normalization ->
Machine Learning -> Decision Rules -> Next Day Predictions]


Figure 2. A block diagram for the proposed system. 


A filter was designed to remove any irradiance record for which no PV power value is reported at the same
time. Such gaps exist because many records were not reported correctly, either because of network connection
disruptions or, in some cases, because of an inverter failure. After filtering, each irradiance record is associated
with a solar PV output power value at the same hour, for a total of 19249 samples as depicted in Figure 3. As shown in
Figure 2, the dataset is then normalized between 0 and 1 for better machine learning performance.
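
The filtering, association and normalization steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names `associate` and `min_max`, the dictionary layout keyed by timestamp, and the toy values are all hypothetical, and min-max scaling is assumed as the way to normalize between 0 and 1 (the text does not name the method).

```python
from datetime import datetime

def associate(irradiance, power):
    """Keep only the hours for which both an irradiance and a power value exist."""
    pairs = []
    for ts in sorted(irradiance):
        irr = irradiance[ts]
        p = power.get(ts)
        # Drop hours where either reading is missing (e.g. network or inverter faults).
        if irr is None or p is None:
            continue
        pairs.append((ts, irr, p))
    return pairs

def min_max(values):
    """Scale a list of numbers linearly to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Toy hourly records (made-up numbers, not ASU measurements).
irradiance = {
    datetime(2015, 5, 15, 10): 600.0,
    datetime(2015, 5, 15, 11): 800.0,
    datetime(2015, 5, 15, 12): 1000.0,
    datetime(2015, 5, 15, 13): 900.0,
}
power = {
    datetime(2015, 5, 15, 10): 120.0,
    datetime(2015, 5, 15, 11): None,   # reading lost
    datetime(2015, 5, 15, 12): 200.0,  # 13:00 is absent altogether
}

pairs = associate(irradiance, power)
irr_norm = min_max([irr for _, irr, _ in pairs])
p_norm = min_max([p for _, _, p in pairs])
```

In this toy case only the 10:00 and 12:00 records survive the filter, because the other two hours have no usable power value.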


3.2. Artificial Neural Networks 


An ANN is a machine learning model that interconnects non-linear elements through adjustable weights.
The structure of ANN consists of three layers: input, hidden, and output layers as illustrated in Figure 4 [19]. The 
input layer receives the raw data, and then these inputs are processed in the hidden layer to be finally sent as computed 
information from the output layer [5]. 
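
As a rough illustration of how data flows through the three layers, the sketch below computes one forward pass by hand. The tanh hidden activation, the linear output node, the function name `forward` and all weight values are assumptions made for this example; the text does not state which activation functions were used.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> one tanh hidden layer -> single linear output node."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical adjustable weights: 2 inputs, 2 hidden units, 1 output.
w_hidden = [[0.5, -0.5], [1.0, 1.0]]
b_hidden = [0.0, 0.0]
w_out = [1.0, 2.0]
b_out = 0.1

y_zero = forward([0.0, 0.0], w_hidden, b_hidden, w_out, b_out)
y_ones = forward([1.0, 1.0], w_hidden, b_hidden, w_out, b_out)
```

Training then consists of adjusting `w_hidden`, `b_hidden`, `w_out` and `b_out` so that the output matches the targets as closely as possible.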

Neural network learning methods provide a robust way to interpret real-world sensor data [20],
and they have been widely used in the field of solar energy [21]. Artificial intelligence techniques can be used for sizing
PV systems: stand-alone PVs, grid-connected PV systems, and PV-wind hybrid systems [22]. There are many learning
algorithms that could be used in our work [23, 24, 25], but the literature shows that ANN systems
provide excellent prediction and classification results in similar applications such as [26] and [27].


3.3. ANN Experiments and Optimisation 


In this research work, an ANN model was created with five inputs representing the solar irradiance
(Irr) records at the same time of day on the previous five days, which are associated with the current solar PV output
power (P) that represents the target function (output node). So, if the mean power value for the hour h on day d is
represented by $P_h(d)$, then it is associated with the irradiance values at the same hour h for the previous five days:
$\mathrm{Irr}_h(d-1)$, $\mathrm{Irr}_h(d-2)$, $\mathrm{Irr}_h(d-3)$, $\mathrm{Irr}_h(d-4)$, $\mathrm{Irr}_h(d-5)$.


Figure 3. The associated PV power and irradiance data (0 on the time axis corresponds to 15 May 2015).


[Diagram: input layer, hidden layer, output layer.]

Figure 4. The structure of ANN.
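
The pairing of each power value with the five lagged irradiance values at the same hour can be sketched as below. The function name `build_samples`, the `(day_index, hour)` keys and the toy values are hypothetical, and skipping any sample with a missing lag is an assumption (the text does not say how gaps in the five-day window were handled).

```python
def build_samples(irr, power, n_lags=5):
    """Pair P_h(d) with [Irr_h(d-1), ..., Irr_h(d-n_lags)].

    irr and power map (day_index, hour) -> normalized value.
    Samples with any missing lagged irradiance are skipped (assumption).
    """
    samples = []
    for (day, hour), target in sorted(power.items()):
        lags = [irr.get((day - k, hour)) for k in range(1, n_lags + 1)]
        if any(v is None for v in lags):
            continue
        samples.append((lags, target))
    return samples

# Toy data: a single hour (h = 12) over seven days, made-up values.
irr = {(d, 12): 0.1 * d for d in range(7)}
power = {(d, 12): 0.05 * d for d in range(7)}

samples = build_samples(irr, power)
```

With seven days of toy data, only days 5 and 6 have a complete five-day history, so two samples are produced, each with five inputs and one target.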

All training and testing experiments were carried out using the MATLAB ANNs toolbox with the aid of
the back-propagation learning algorithm [28]. To optimize the model performance, the number of hidden layers was
incremented from 1 to 30, and for each number of hidden layers ten experiments were carried out, each using a different
set of randomly mixed samples with 80% of the samples (15399 samples) for training, 5% for validation, and 15% for
testing. The RMSE averaged over the ten experiments is used to evaluate the performance for each specific number
of hidden layers.
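
The experimental protocol above (random 80/5/15 splits, ten repetitions per setting, averaged RMSE) can be outlined as in the following sketch. It does not use MATLAB and does not train a network: `train_and_score` is a hypothetical callable standing in for the training step, `dummy_score` is a made-up stand-in with no relation to the reported results, and the rounding of the split sizes is an assumption.

```python
import random
from statistics import mean

def split_indices(n, seed, train=0.80, val=0.05):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(train * n)
    n_val = int(val * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def sweep(n_samples, settings, train_and_score, repeats=10):
    """Average the test RMSE of each setting over `repeats` random splits."""
    results = {}
    for setting in settings:
        scores = []
        for r in range(repeats):
            tr, va, te = split_indices(n_samples, seed=1000 * setting + r)
            scores.append(train_and_score(setting, tr, va, te))
        results[setting] = mean(scores)
    return results

def best_setting(results):
    """Return the setting with the lowest average RMSE."""
    return min(results, key=results.get)

# Hypothetical stand-in for "train an ANN and return its test RMSE".
def dummy_score(setting, tr, va, te):
    return abs(setting - 3) + 1.0

results = sweep(200, range(1, 6), dummy_score)
train_idx, val_idx, test_idx = split_indices(19249, seed=0)
```

For 19249 samples this split yields 15399 training indices, matching the count quoted above; the exact validation and test counts depend on the rounding convention, which is assumed here.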

A total of 300 sets of training, validation, and testing experiments were carried out. The best ANN
configuration was found with 22 hidden layers, providing an average RMSE of 0.0721 and a best validation MSE of
0.0053397; the corresponding testing performance is illustrated in Figure 5 and Figure 6. These results compare
favourably with the methods and measures reported in the related literature.


[Regression plots: Training R=0.97063; Validation R=0.96975.]

Figure 5. Correlation coefficients calculations.


A two-day prediction of the PV energy production for 23 and 24 May 2015 was simulated using our model
(see Figure 7 (left)), and the system provided an RMSE of 0.0234 and a correlation coefficient of R=0.9983, which means
an almost perfect linear relationship between solar irradiation and the output power generated. In addition, a ten-day
simulation for the period from 20 to 30 July 2015 provided an RMSE of 0.0333 and R=0.9965, as illustrated in Figure 7
(right).
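
The two figures of merit quoted for these simulations, RMSE and the correlation coefficient R, can be computed as in the sketch below. The function names and the toy series are hypothetical, and R is assumed to be the Pearson correlation coefficient between measured and forecasted values.

```python
import math

def rmse(measured, forecast):
    """Root mean square error between two equally long series."""
    n = len(measured)
    return math.sqrt(sum((m - f) ** 2 for m, f in zip(measured, forecast)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient (assumed definition of R)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy normalized series (made-up values, not the ASU data).
measured = [0.0, 0.2, 0.6, 0.9, 0.4]
forecast = [0.0, 0.25, 0.55, 0.85, 0.45]

error = rmse(measured, forecast)
r = pearson_r(measured, forecast)
```

Because the data are normalized between 0 and 1, an RMSE computed this way is a fraction of the normalized range rather than a value in kW.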


4. CONCLUSIONS 

In this work, a machine learning model is proposed to analyse historical solar PV output power and solar
irradiance data to provide a set of decision rules that represent a proper prediction system. All data records in the 
duration from 15 May 2015 to 30 September 2017 were used in this research work and the ANN-based system
provided promising results. 

We believe that this work is the first to predict the next-day solar PV output power using real-time irradiation
data measured accurately at a weather station that is located in the same geographical area as the PV plants.


[Training plot: Mean Squared Error (mse) versus epochs (0 to 60) for the Train and Validation curves.
Best Validation Performance is 0.0053397 at epoch 54.]

Figure 6. ANN experiments using 22 hidden layers.


Figure 7. Measured and forecasted PV energy production for 23-24 May 2015 (left) and 20-30 July 2015 (right).